Abstract

We use computational phylogenetic techniques to solve a central problem in inferential network
monitoring. More precisely, we design a novel algorithm for multicast-based delay inference, that
is, the problem of reconstructing delay characteristics of a network from end-to-end delay measurements
on network paths. Our inference algorithm is based on additive metric techniques used in
phylogenetics. It runs in polynomial time and requires a sample of size only poly(log n). We also
show how to recover the topology of the routing tree.

1 Introduction

Network tomography. Inferential network monitoring, also known as network tomography [27],
consists in reconstructing various properties of large communication networks from indirect measurements
in order to facilitate the management of these networks. Network inference can be achieved
by two general approaches. In the internal approach, one takes measurements directly at the edges
and nodes of the network. This approach suffers from several drawbacks: the network operator may
not allow access to internal devices of the network or may not make public measurements on them;
the routers may not have the technological capabilities to perform the required measurements; direct
measurements may create extra computational burden as well as congestion in the network. This has
led some in the networking community to consider instead the external approach. In this case, one uses
so-called "end-to-end" measurements, e.g., measurements of delays or rate of packet drops between
nodes in the network, and seeks to infer the desired network properties from them. This gives rise to
an inverse problem similar to tomographic image reconstruction.

Our aim in this paper is to propose a novel approach to this problem. We focus on multicast-based 
inference. Multicast routing consists in sending a packet from a source to a set of receivers through 
a routing tree. The packet is duplicated at each branch point and sent further down the tree. The
routing tree is generally unknown to the user. The idea is to use inherent correlation of measurements 
between different receivers to reconstruct the topology of the routing tree as well as to estimate link 
properties of this tree. The main link property we consider here is the delay distribution. The multicast 
inference approach was introduced in [U [23] . 

A core difficulty of the problem is to devise efficient, scalable algorithms which consistently estimate 
the desired network properties. Several techniques have been used in the network tomography liter- 
ature, notably maximum pseudo-likelihood, EM algorithms and Markov chain Monte Carlo methods. 
See [6] for a detailed survey and bibliographic references. In this paper, we introduce a new methodol- 
ogy for multicast delay inference inspired by techniques from the field of phylogenetics in biology, that 
is, the reconstruction of evolutionary trees from molecular data. Our methodology has the advantage 
of being provably consistent and computationally efficient. It also uses a small asymptotic sample size. 
This is crucial to reduce the burden on the network as well as to obtain a consistent "snapshot" of 
the network, which is intrinsically dynamic in nature. Typical networks undergo sporadic medium
to large-scale changes in structure over time, therefore algorithms with low sample complexity are
essential. Concurrently to our work, Liang et al. [15] used similar ideas to tackle the related multicast 
packet loss inference problem. Also, Ni and Tatikonda [19] independently proposed a Markov-based 
inference algorithm similar to ours for multicast delay inference — although our work appears to be 
the first rigorous analysis of the sample complexity of this approach. See Section 1.2 for a precise
statement of our results and Section 1.3 for a discussion of previous work. The results detailed here
were first announced in [5]. 

Phylogeny background. A core problem in evolutionary biology is the inference of evolutionary 
histories of organisms from molecular data. Evolution is usually represented by a tree where branching 
points indicate speciation events. The root of the tree is the common ancestor to all species in the tree 
and the leaves are contemporary (extant) species. Molecular data is assumed to evolve according to a 
standard Markov model. The phylogenetic reconstruction problem is the following. From measurement 
of sequences of molecular data at the leaves, one seeks to reconstruct the topology of the evolutionary 
tree as well as mutation characteristics along the branches. See [11] and [25] for an overview of the
field of phylogenetics. 

Various statistical and computational techniques have been used to solve the phylogenetic reconstruction
problem: maximum likelihood, Bayesian, parsimony, and distance-based methods. In this
paper, we adapt and extend distance-based techniques to deal with a class of models introduced in [23] 
in connection with the multicast network inference problem — this new class of models is similar to the 
Markov models used in phylogenetics but presents challenges of its own. The main idea in distance- 
based methods is to define a so-called tree metric from mutation parameters. A tree metric is a metric 
on the leaves of the tree which can be realized as a path metric on a corresponding weighted tree. (See 
Section [2] for more details.) After being estimated, the metric allows the reconstruction of the tree 
and its characteristics. A main advantage of this approach is that it leads to computationally efficient 
algorithms with provable sample requirement guarantees. 

1.1 Basic Definitions 

A broadcasting process on a tree. We now give a more formal statement of the multicast inference 
problem introduced in [23]. Let $T = (V, E)$ be a tree on $n + 1$ leaves $L$ (representing the routing tree)
and let $\{d_e\}_{e \in E}$ be a set of independent positive random variables on the edges (representing the
delays). Leaf 0, the source, is the root of the tree. The remaining $n$ leaves are the receivers. We assume
that all internal nodes have degree at least 3. 

A realization of the multicast delay process works as follows: the root sends a packet to the receivers 
through the routing tree; at every branching point, the packet is duplicated; on every link e, an 
independent random delay $d_e$ is experienced by the packet. More formally, we define the multicast
delay process $\{D_u\}_{u \in V}$ as follows. Let $P_{ij}$ be the path (set of edges) between nodes $i$ and $j$ in $T$. For
a node $u$, let

$$D_u = \sum_{e \in P_{0u}} d_e. \qquad (1)$$

Note that $D_u$ is the total delay at node $u$ in the network.
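As an illustration of this definition, the following minimal Python sketch simulates the multicast delay process on a small rooted tree. The tree, the `parent` encoding, the helper names, and the Uniform[0, 1] edge delays are all assumptions made for the example; they are not taken from the paper.

```python
import random

def node_delays(parent, edge_delay):
    """Total delay D_u at every node: the sum of the edge delays on the path
    from the source (node 0) to u. parent[u] is the parent of u and
    edge_delay[u] is the delay on the edge (parent[u], u)."""
    total = {0: 0.0}

    def resolve(u):
        if u not in total:
            total[u] = resolve(parent[u]) + edge_delay[u]
        return total[u]

    for u in parent:
        resolve(u)
    return total

def sample_leaf_delays(parent, receivers, draw, k, seed=0):
    """k independent realizations of the delays observed at the receivers.
    draw(rng) returns one edge delay; it is called independently per edge."""
    rng = random.Random(seed)
    samples = []
    for _ in range(k):
        edge_delay = {u: draw(rng) for u in parent}
        total = node_delays(parent, edge_delay)
        samples.append({a: total[a] for a in receivers})
    return samples

# Hypothetical routing tree: source 0, internal nodes 1 and 3, receivers 2, 4, 5.
parent = {1: 0, 2: 1, 3: 1, 4: 3, 5: 3}
receivers = [2, 4, 5]

fixed = node_delays(parent, {1: 1.0, 2: 2.0, 3: 0.5, 4: 0.25, 5: 4.0})
samples = sample_leaf_delays(parent, receivers, lambda rng: rng.uniform(0.0, 1.0), k=5)
print(fixed[4], samples[0])
```

Only the receiver entries of each sample would be visible to the inference procedure; the internal totals are computed here purely to generate data.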

The multicast inference problem. The tree and delay distributions are actually unknown to us. 
We are only given access to $k$ independent samples of delays at the leaves $\{D_a^1\}_{a \in L}, \ldots, \{D_a^k\}_{a \in L}$. Our
goal is to reconstruct the routing tree and estimate the delay distributions using these samples. We
now define more precisely what we mean by the estimation of the delay distributions. In this work, we
assume that each edge delay distribution (in general, different) is characterized by a constant number,
say $J - 1 \ge 1$ (independent of $n$), of consecutive central moments. That is, we assume that the
distribution of $d_e$ is determined by the central moments

$$w_e^{(j)} = E\left[(d_e - E[d_e])^j\right],$$

for all $e \in E$ and $2 \le j \le J$. Our goal is to estimate these moments within a fixed accuracy. More
formally, we make the following assumption. We first need a definition. 

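Definition 1 (Regular Families) Let $\varepsilon > 0$ and $J \ge 2$ be fixed. Let $Q = \{Q_\theta\}_{\theta \in \Theta}$ be a family of
distributions on $\mathbb{R}$ parametrized by $\theta \in \Theta$ where $\Theta$ is a subset of a Euclidean space. Let $\{m^{(j)}(\theta)\}_{2 \le j \le J}$
be the first $J - 1$ central moments of $Q_\theta$. We say that the family $Q$ is $(\varepsilon, J)$-regular if there exists a
map $\Psi$ from $\mathbb{R}^{J-1}$ to $\Theta$ and a $\delta > 0$ such that if the vector $w = \{w^{(j)}\}_{2 \le j \le J}$ satisfies

$$\left| w^{(j)} - m^{(j)}(\theta) \right| < \delta$$

for all $2 \le j \le J$, then

$$\| Q_\theta - Q_{\Psi(w)} \|_1 < \varepsilon.$$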

In Appendix A we give simple examples of regular families.

Assumption 1 (Regularity and Boundedness) Let $\varepsilon > 0$ and $J \ge 2$ be fixed (independent of
$n$). We assume that all edge delay distributions are from a fixed $(\varepsilon, J)$-regular family of distributions.
Furthermore, we assume that the delays are uniformly bounded, namely there is a constant $M > 0$
independent of $n$ such that for all $e \in E$, $d_e \in [0, M]$.

This framework is simple enough to be tractable yet general enough to accommodate large classes of 
distributions: parametrized distributions, e.g., beta distributions; and nonparametrized distributions, 
e.g., discretized distributions on {0, 1, . . . , M}. Further we need the following assumption. 

Assumption 2 (Lower Bound on Second Moment) We assume that there is a constant $f > 0$
(independent of $n$) such that for all $e \in E$,

$$w_e^{(2)} \ge f.$$

To sum up, the multicast inference problem is defined as follows. 

Definition 2 (Multicast Inference Problem, Moment Version) Let $\varepsilon > 0$ and $J \ge 2$ be fixed.
The multicast inference problem consists in the following. Let $T$ and $\{w_e^{(j)}\}_{e \in E,\, 2 \le j \le J}$ be any tree
(with internal degrees at least 3) and set of central moments on edges. Given samples of delays at the
leaves, we are required to:

1. Tree Reconstruction. Recover $T$.

2. Moment Estimation. Estimate all characteristic moments $\{w_e^{(j)}\}_{e \in E,\, 2 \le j \le J}$ within $\varepsilon$.

Remark 1 As noted by Lo Presti et al. [23], the means of the edge delay distributions are, in general,
unidentifiable. See Figure 1 for an illustration. In particular, one cannot hope to recover the deterministic
transmission delay on each link. But, as noted in [23], this is not a major issue. Indeed, in
practice, one is only interested in the variable portion of the delay, that is, the portion resulting from
traffic. To restore identifiability, Lo Presti et al. proceed by subtracting the lowest observed delay on
each receiver, in order to remove the (estimated) deterministic component of the delay. They further
assume that the variable portion of the delay "starts at 0." We also make this last assumption (see our
examples of regular delay distributions in Appendix A). However, instead of subtracting the minimum
observed delay (which may be unreliable on a large network), we use central moments — which are not 
affected by the deterministic transmission delay. 




Figure 1: Unidentifiability of Mean Delay: If one were to replace $d_1$ with $d_1 + \mu$ and $d_2, d_3$ with
$d_2 - \mu, d_3 - \mu$ for $\mu > 0$ (assuming $\mu$ can be chosen so that all delays remain positive) then the
distribution of delays at a, b would be unchanged. This example also shows that one cannot deduce 
the delays on all edges given total delays at all leaves. 



1.2 Our Results

Our main result is the following theorem. 

Theorem 1 (Main Result) Let $\varepsilon > 0$ and $J \ge 2$ be fixed. Let Assumptions 1 and 2 hold. Then,
there is a polynomial-time algorithm which solves the multicast inference problem, with high probability
using $k = O(\mathrm{poly}(\log n))$ samples.

See Theorems 3, 4 and 5 below for more precise statements.

The proofs of the main theorems rely on the important notion of a tree metric from phylogenetics. 
Roughly speaking, a tree metric is a metric on the leaves of a tree such that the distance between 
any two leaves can be written as a sum of edge weights on the corresponding path. (See Section [2] for 
definitions.) There are two components to our algorithm: 

1. Topology reconstruction: The reconstruction of the routing tree can be achieved by adapting 
known phylogenetic reconstruction algorithms — once the proper delay-based metric is defined. 
This result is proved in Section 3. The relevant phylogenetic background is introduced in
Section 2.

2. Moment estimation on edges: Most of the technical work of this paper is in deriving and 
analyzing a metric-based algorithm for inferring edge delay distributions (Theorems 4 and 5).
For this purpose, a) we relax the notion of a tree metric to allow nonnegative edge weights, b) 
we define appropriate delay-based metrics, and c) we show how to estimate these metrics. The 
analysis relies on large deviations arguments. 

As far as we are aware, our algorithm is the first multicast inference algorithm to be both provably 
efficient and consistent. Previous work concerned mostly non-rigorous techniques such as maximum 
pseudo-likelihood and EM algorithms. See [6] for details. An exception is the independent, unpublished
work of Liang et al. [15] which uses techniques similar to ours in the related context of multicast packet
drop inference. 

1.3 Discussion 

Validity of assumptions. The multicast delay process defined in Section 1.1 relies on two basic
assumptions about routing and traffic which make its analysis possible: temporal and spatial independence.
In reality, of course, both assumptions are violated to some extent. Lo Presti et al. [23]
(see also [3]) studied the effect of these violations empirically and concluded that the multicast delay
process is a useful first approximation to the underlying complex process. We briefly summarize their
findings. 

Temporal dependence — delays at a given link being correlated at different points in time — is common 
in communication networks. But, as it turns out, its impact is rather mild for our purposes. Indeed 
the type of inference procedure studied in [23] (as well as in the current paper) does not actually 
require independence in time but only ergodicity — a much weaker assumption; more precisely, the 
estimator in [23] (and in the current paper) is consistent as long as the delay process is ergodic. Hence,
the temporal dependencies impact only the convergence rate of the inference procedure. Lo Presti 
et al. showed empirically that, although this effect cannot be ignored, it is rather mild. Quantifying 
exactly the effect of temporal correlations on the theoretical convergence rate of an estimator is non- 
trivial. 

As for spatial correlations — dependencies in delays on neighboring links — Lo Presti et al. found 
that they can produce a systematic bias in the estimation. However, they showed empirically that 
the bias is a small, second-order effect, possibly — they argue — because the diversity of traffic on the 
network results only in localized, short-term correlations in delays. They also point out that very little 
is known about the precise structure of such spatial correlations in real networks, making it hard to 
derive a good model for them. 

Another assumption implicit in our model is that the process, including the routing tree itself, 
remains homogeneous over time. In fact, there are sporadic large-scale changes in the network. These 
explain why a low sample complexity is critical for an inference procedure to be useful in practice. 
Minimizing the sample complexity is the main focus of this paper. 

Related results. The multicast delay inference problem was formalized by Lo Presti et al. in [23].
In that paper, the authors give a procedure to infer a discretized delay distribution on each link, given 
the routing tree topology. Their algorithm is based on an ad-hoc fixed point equation that is solved 
by least squares. Moreover, these authors show that their estimator is asymptotically normal with a 
variance-covariance matrix depending implicitly on the delay characteristics. More explicit formulas 
are given in the limit of small delays. The algorithm is tested on small networks and the dependence 
on the size is not given. 

More recently, Ni and Tatikonda [19, 20, 21, 22] (in work subsequent to ours [2]) used phylogenetic
techniques to recover the routing tree topology in this context. Similarly to the current paper, they 
use distance-based techniques. The basic algorithm they consider is the well-known Neighbor-Joining 
(NJ) algorithm which they apply to various tree metrics, for instance, the delay variance metric (as 
we do here). They also deal with trees of internal degrees higher than 3 by introducing a variant of 
NJ called Rooted Neighbor-Joining (RNJ) [21] (based on a technique equivalent to what is known 
in phylogenetics as the Farris transform [10]). They show more precisely that RNJ is a consistent
estimator of the routing tree, but no convergence rate is given. Note, however, that RNJ has in fact 
a high sample complexity due to its reliance on the diameter of the tree. See, e.g., [1]. See also our 
discussion about diameter vs. depth in Section 2.2. Here, we make use of state-of-the-art phylogenetic
reconstruction techniques to derive a low sample complexity algorithm for routing tree reconstruction. 
We also show how to infer delay distributions. A technique to infer discrete delays was also subsequently 
obtained by Ni and Tatikonda [20] (although no convergence rate is provided). 

A related network tomography problem is the so-called multicast link loss inference problem, where 
one observes packet losses at the receivers of a multicast routing tree — instead of delays — and seeks to 
infer the routing tree and packet drop probabilities on the links. This problem was formalized in [4] 
where a maximum-likelihood estimation procedure was analyzed. In [3], the network topology is as- 
sumed known. In more recent independent work, Liang et al. [15] (unpublished) applied phylogenetic 
techniques to the inference of the routing topology in this context. Indeed, the multicast link loss 
problem is in essence a special case of the standard model of DNA evolution used in biology. Similarly 
to the current paper, Liang et al. use distance-based techniques. More precisely, they give a com- 



putationally efficient reconstruction algorithm with sample complexity 0(b~^ log n) where b (possibly 
depending on n) is a lower bound on the link loss probability. Ni and Tatikonda [19, 20, 21, 22] (see
above) also considered the link loss inference problem. 

1.4 Organization of the Paper 

The paper is organized as follows. We start with some phylogenetic background in Section 2. Our
results concerning the topology reconstruction can be found in Section 3. We then present and
analyze our delay inference algorithm in Section 4.

2 Phylogenetic Reconstruction Techniques 

In this section, we summarize and adapt to our setting the DMR algorithm of [8]. 

2.1 Basics 

We begin with a few basic notions from phylogenetics. 

Tree metrics. In phylogenetics, the notion of a tree metric is useful for reconstructing the topology 
of phylogenies. We use the notation $\mathbb{R}_{++} = \{x \in \mathbb{R} : x > 0\}$.

Definition 3 (Tree Metric) Let $L$ be a finite set with cardinality $n$. A function $W : L \times L \to \mathbb{R}_+$
defines a (nondegenerate) tree metric if the following holds. There exist a tree $T = (V, E)$ with leaf set
$L$ and a weight function $w : E \to \mathbb{R}_{++}$ such that $W(a, b) = \sum_{e \in P_{ab}} w_e$ for all $a, b \in L$ where $P_{ab}$ is the
path between $a$ and $b$ in $T$.
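The definition can be made concrete with a small Python sketch that builds the path metric of a weighted tree and checks the classical four-point condition satisfied by every tree metric (of the three pairings of four leaves, the two largest sums coincide). The quartet tree, its weights, and the function names are assumptions chosen for the example.

```python
from collections import defaultdict

def tree_metric(edges, leaves):
    """W[a, b] = sum of the edge weights on the path between leaves a and b.
    `edges` maps a pair of endpoints (u, v) to a positive weight."""
    adj = defaultdict(list)
    for (u, v), w in edges.items():
        adj[u].append((v, w))
        adj[v].append((u, w))

    def distances_from(src):
        # Depth-first traversal; in a tree every node is reached by a unique path.
        dist = {src: 0.0}
        stack = [src]
        while stack:
            u = stack.pop()
            for v, w in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + w
                    stack.append(v)
        return dist

    W = {}
    for a in leaves:
        dist = distances_from(a)
        for b in leaves:
            W[a, b] = dist[b]
    return W

def four_point_holds(W, a, b, c, d, tol=1e-9):
    """Four-point condition: the two largest of the three pair sums are equal."""
    sums = sorted([W[a, b] + W[c, d], W[a, c] + W[b, d], W[a, d] + W[b, c]])
    return abs(sums[2] - sums[1]) <= tol

# Hypothetical quartet: leaves a, b attached to x, leaves c, d attached to y.
edges = {("a", "x"): 1, ("b", "x"): 2, ("x", "y"): 3, ("c", "y"): 4, ("d", "y"): 5}
W = tree_metric(edges, ["a", "b", "c", "d"])
print(W["a", "c"], four_point_holds(W, "a", "b", "c", "d"))
```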

Tree metrics are usually estimated from samples of the tree process at the leaves. In that context, 
Azuma's inequality is useful (see, e.g., [18]). 



Lemma 1 (Azuma-Hoeffding Inequality) Suppose $X = (X_1, \ldots, X_k)$ are independent random
variables taking values in a set $S$, and $f : S^k \to \mathbb{R}$ is any $t$-Lipschitz function: $|f(x) - f(y)| \le t$
whenever $x$ and $y$ differ at just one coordinate. Then, $\forall \lambda > 0$,

$$P\left[ f(X) - E[f(X)] \ge \lambda \right] \le \exp\left( -\frac{\lambda^2}{2 t^2 k} \right)$$

and

$$P\left[ f(X) - E[f(X)] \le -\lambda \right] \le \exp\left( -\frac{\lambda^2}{2 t^2 k} \right).$$
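To give a feel for the scale of this bound, the sketch below evaluates the one-sided tail $\exp(-\lambda^2 / (2 t^2 k))$ for the sample mean of $k$ values in $[0, 1]$, which is $(1/k)$-Lipschitz, and compares it with a simulated tail frequency. The choice of Uniform[0, 1] variables, the parameters, and the function names are assumptions made for illustration only.

```python
import math
import random

def azuma_tail(lam, t, k):
    """One-sided bound exp(-lam^2 / (2 t^2 k)) for a t-Lipschitz function of k variables."""
    return math.exp(-lam ** 2 / (2.0 * t ** 2 * k))

def empirical_tail(lam, k, trials=2000, seed=0):
    """Fraction of trials in which the mean of k Uniform[0, 1] draws
    exceeds its expectation 1/2 by at least lam."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mean = sum(rng.random() for _ in range(k)) / k
        if mean - 0.5 >= lam:
            hits += 1
    return hits / trials

k, lam = 200, 0.1
# Changing one of the k values moves the sample mean by at most 1/k.
bound = azuma_tail(lam, t=1.0 / k, k=k)
observed = empirical_tail(lam, k)
print(bound, observed)
```

With $t = 1/k$ the exponent becomes $-\lambda^2 k / 2$, which is the same mechanism by which a Lipschitz constant of order $1/k$ produces a bound that improves with the number of samples.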



Bipartitions. A useful combinatorial description of a tree T = (V, E) is obtained by noticing that 
each edge $e \in E$ of the tree naturally corresponds to a partition of the leaves $L$ into two subsets (that
is, the leaves on either "side" of e). Such partitions are called bipartitions and they characterize the 
tree: it is easy to generate all bipartitions corresponding to a given tree, and on the other hand, there is 
a simple efficient iterative procedure to recover a tree from the set of all of its bipartitions. See [11, 25]
for details.



2.2 Distorted Metric Algorithms 

Classical distance-based reconstruction algorithms (that is, those methods based on tree metrics) such 
as UPGMA [26] or Neighbor-Joining (NJ) [23], typically make use of all pairwise distances between
leaves. This leads to difficulties because "long" distances are more "noisy" and require a large number 
of samples to be accurately estimated. For instance, in the phylogenetic context, the widely used NJ 
algorithm is computationally efficient, but it is known to require exponentially many samples — even 
for simple linear trees [13].

An important breakthrough was made in [9] where it was shown that it was in fact enough to 
use "short" distances to fully recover the tree under reasonable assumptions. To help understand this 
result, we need a notion of tree "depth." Given an edge $e \in E$, the chord depth of $e$ is the length (in
graph distance) of the shortest path between two leaves on which $e$ lies.¹ That is,

$$\Delta(e) = \min\{ d(u, v) : u, v \in L,\ e \in P_{uv} \},$$

where $d$ is the graph distance on $T$. We define the chord depth of a tree $T$ to be the maximum chord
depth in $T$

$$\Delta(T) = \max\{ \Delta(e) : e \in E \}.$$

It is easy to show that $\Delta(T) \le 2 \log_2 n$ if the degree of all internal nodes is at least 3 (on each side of
an edge, a leaf is reached within $\log_2 n$ steps; argue by contradiction). In a nutshell, the key insight
behind the results in [9] is that the diameter and the
depth of a tree behave very differently: even though the diameter can be as large as $O(n)$, the depth
is always $O(\log n)$, in other words, each edge lies on a "short" path between two leaves. Using clever
combinatorial arguments, Erdős et al. [9] showed that one can reconstruct trees with far fewer
samples by ignoring those distances corresponding to paths longer than $O(\log n)$.
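The contrast between diameter and chord depth can be checked on a small example. The Python sketch below computes the chord depth of every edge by breadth-first search on each side of the edge; the caterpillar tree and the function names are assumptions made for illustration.

```python
from collections import defaultdict, deque

def chord_depths(edge_list):
    """Chord depth of every edge: the length (in edges) of the shortest
    leaf-to-leaf path that passes through that edge."""
    adj = defaultdict(set)
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)

    def nearest_leaf(start, blocked):
        """Graph distance from `start` to the closest leaf, never stepping onto `blocked`."""
        seen = {start, blocked}
        queue = deque([(start, 0)])
        while queue:
            node, d = queue.popleft()
            if len(adj[node]) == 1:   # degree 1: this node is a leaf
                return d
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, d + 1))
        raise ValueError("no leaf reachable")

    return {(u, v): nearest_leaf(u, v) + 1 + nearest_leaf(v, u) for u, v in edge_list}

# Hypothetical caterpillar: spine x1-x2-x3-x4 with leaves a, b, c, d, e, f.
caterpillar = [("a", "x1"), ("b", "x1"), ("x1", "x2"), ("c", "x2"), ("x2", "x3"),
               ("d", "x3"), ("x3", "x4"), ("e", "x4"), ("f", "x4")]
depths = chord_depths(caterpillar)
tree_chord_depth = max(depths.values())
print(tree_chord_depth)
```

In this example the longest leaf-to-leaf path (from `a` to `e`) has 5 edges, while no edge has chord depth larger than 3; lengthening the spine increases the diameter but leaves the chord depth unchanged.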

More recently, Daskalakis et al. [8] relaxed some of the assumptions in [9]. In particular, they gave
a reconstruction algorithm based on short distances allowing internal degrees bigger than 3 — which is 
particularly relevant in the networking context. Their algorithm, which we will call the DMR algorithm, 
reconstructs all bipartitions using only distances smaller than a threshold of order O(logn). To check 
that the algorithm works, one only needs to show that such distances are accurately estimated for a 
given number of samples. In the tomography setting, the DMR algorithm will allow us to reconstruct 
the routing tree using as few as poly log n samples (see next section). The details of the algorithm are
sketched in Appendix B.

We now state a corollary of [8] that will be useful to us. We first need the following definition 
which formalizes the idea that short distances are accurately estimated (and that long distances can 
in some sense be ignored). 

Definition 4 (Distorted Metric [16, 12]) Let $T = (V, E)$ be a tree with leaf set $L$ and edge weight
function $w : E \to \mathbb{R}_{++}$. Let $W : L \times L \to \mathbb{R}_+$ be the corresponding tree metric. Fix $\tau, M > 0$. We say
that $\hat W : L \times L \to (0, +\infty]$ is a $(\tau, M)$-distorted metric for $T$ or a $(\tau, M)$-distortion of $W$ if:

1. (Symmetry) For all $u, v \in L$, $\hat W$ is symmetric, that is,

$$\hat W(u, v) = \hat W(v, u);$$

2. (Distortion) $\hat W$ is accurate on "short" distances, that is, for all $u, v \in L$, if either $W(u, v) < M + \tau$
or $\hat W(u, v) < M + \tau$ then

$$\left| W(u, v) - \hat W(u, v) \right| < \tau.$$
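The two conditions of this definition translate directly into a check. The Python sketch below tests whether a candidate estimate is a distortion of a given tree metric, writing `tau` for the accuracy parameter and `M` for the distance threshold; the three-leaf example values and the function name are assumptions made for illustration, and `math.inf` stands for the value $+\infty$ allowed on long distances.

```python
import math

def is_distorted_metric(W, W_hat, leaves, tau, M):
    """True if W_hat is symmetric and within tau of W on every pair that is
    'short' (below M + tau) under either W or W_hat."""
    for u in leaves:
        for v in leaves:
            if u == v:
                continue
            if W_hat[u, v] != W_hat[v, u]:
                return False                      # symmetry fails
            short = W[u, v] < M + tau or W_hat[u, v] < M + tau
            if short and abs(W[u, v] - W_hat[u, v]) >= tau:
                return False                      # a short distance is inaccurate
    return True

def symmetric(pairs):
    """Expand {(u, v): value} into a table containing both orderings."""
    table = {}
    for (u, v), value in pairs.items():
        table[u, v] = value
        table[v, u] = value
    return table

leaves = ["a", "b", "c"]
W = symmetric({("a", "b"): 1.0, ("a", "c"): 5.0, ("b", "c"): 5.0})

# Long distances may be arbitrarily wrong (even infinite) without violating the definition.
good = symmetric({("a", "b"): 1.05, ("a", "c"): math.inf, ("b", "c"): 7.0})
# A short distance estimated too coarsely violates it.
coarse = symmetric({("a", "b"): 1.3, ("a", "c"): math.inf, ("b", "c"): 7.0})
# So does a long distance that is reported as short.
collapsed = symmetric({("a", "b"): 1.05, ("a", "c"): 1.0, ("b", "c"): 7.0})

print(is_distorted_metric(W, good, leaves, tau=0.1, M=2.0))
```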



^Note that unlike we use the graph distance in the definition of chord depth. Because of our assumptions (see 
below) the two graph and weighted distances are the same up to a constant factor. Note also that we are using a different 
definition than [9]. But again the difference is only a constant factor. 



Let $f, g > 0$ be bounds on the edge weights, that is, $f \le w_e \le g$ for all $e \in E$. We say that such an
edge weight function satisfies the $(f, g)$-condition.

Theorem 2 (DMR Algorithm [8]) Let $0 < f < g < +\infty$, $\alpha < 1/6$, and $\beta > 2$. There is a
polynomial-time algorithm $\mathcal{A}$ such that, for all trees $T = (V, E)$ with edge weight function $w$ satisfying
the $(f, g)$-condition and all $(\alpha f, \beta g \Delta(T))$-distortions $\hat W$ of $W$ (where $W$ is the tree metric corresponding
to $w$), $\mathcal{A}$ applied to $\hat W$ returns $T$.

Note that the previous theorem is a deterministic statement about distorted metrics. We show how to 
estimate such a distorted metric from random samples with high probability in Section 3.2.

3 Routing Tree Reconstruction 

The goal of this section is to reconstruct efficiently the topology of the routing tree using Theorem 2.

3.1 Variance Metric 

From Definition 3, one can define a tree metric by first choosing a tree (in our case, the routing tree)
and then defining a weight function on its edges. Any positive quantity can serve as a weight. The 
important point is that one must be able to estimate the resulting tree metric from samples at the 
leaves. This governs the choice of the weight function. 

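Let $T = (V, E)$ be the (unknown) routing tree with leaf set $L$ and consider the choice of weights

$$w_e^{(2)} = \mathrm{Var}[d_e],$$

for all $e \in E$ and the corresponding tree metric

$$W^{(2)}(a, b) = \sum_{e \in P_{ab}} w_e^{(2)},$$

for all $a, b \in L$. Our first task is to check that this metric can be estimated from samples at the leaves.
Let $a, b$ be leaves and consider the quantity $\delta^{(2)}_{ab} = \mathrm{Var}[D_a - D_b]$ (where recall from (1) that $D_u$ is the
delay at $u$). The delays $D_a$ and $D_b$ are observed at the leaves $a$ and $b$ respectively and therefore the
variance of $D_a - D_b$ can be easily estimated. Moreover, we claim that the equality $\delta^{(2)}_{ab} = W^{(2)}(a, b)$
holds. Indeed, denote by $\gamma_{ab}$ the common ancestor of $a$ and $b$, that is, the node at which all three paths
$P_{ab}$, $P_{0a}$, and $P_{0b}$ intersect (where we assume $a, b \ne 0$). The delay accumulated on the shared path
$P_{0\gamma_{ab}}$ cancels in the difference $D_a - D_b$. Then, by independence of the edge delays, we
have

$$\delta^{(2)}_{ab} = \mathrm{Var}[D_a - D_b] = \mathrm{Var}\left[ \sum_{e \in P_{a\gamma_{ab}}} d_e - \sum_{e \in P_{\gamma_{ab} b}} d_e \right] = \sum_{e \in P_{a\gamma_{ab}}} \mathrm{Var}[d_e] + \sum_{e \in P_{\gamma_{ab} b}} \mathrm{Var}[d_e] = W^{(2)}(a, b).$$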



Therefore, we can estimate $W^{(2)}$ by estimating $\delta^{(2)}$ at the leaves.

To estimate $\delta^{(2)}_{ab}$ from $k$ samples, we use the standard unbiased estimator for the variance of $D_a - D_b$

$$\hat\delta^{(2)}_{ab} = \frac{1}{k - 1} \sum_{i=1}^{k} \left( (D_a^i - D_b^i) - (\bar D_a - \bar D_b) \right)^2,$$

where

$$\bar D_a = \frac{1}{k} \sum_{i=1}^{k} D_a^i,$$

and similarly for $\bar D_b$. Below, we will need to show that $\hat\delta^{(2)}_{ab}$ is well concentrated around $\delta^{(2)}_{ab}$, which follows from the Azuma-Hoeffding
inequality (see Lemma 1). The next lemmas provide the necessary Lipschitz condition.
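A minimal Python sketch of this estimation step is given below. It computes the unbiased sample variance of the delay difference for a pair of receivers using `statistics.variance` from the standard library (which divides by $k - 1$); the sample layout (one dict of observed delays per probe), the numbers, and the function names are assumptions made for illustration.

```python
import statistics

def delta_hat(samples, a, b):
    """Unbiased sample variance of D_a - D_b over the k probes.
    `samples` is a list of dicts mapping each receiver to its observed delay."""
    diffs = [s[a] - s[b] for s in samples]
    return statistics.variance(diffs)   # sample variance, denominator k - 1

def variance_metric(samples, receivers):
    """Estimated variance metric for every ordered pair of distinct receivers."""
    return {(a, b): delta_hat(samples, a, b)
            for a in receivers for b in receivers if a != b}

# Three hypothetical probes observed at receivers "a" and "b".
samples = [{"a": 3.0, "b": 1.0}, {"a": 5.0, "b": 1.0}, {"a": 4.0, "b": 3.0}]
W_hat = variance_metric(samples, ["a", "b"])

# Adding a deterministic offset to one receiver leaves the estimate unchanged.
shifted = [{"a": s["a"] + 10.0, "b": s["b"]} for s in samples]
print(W_hat["a", "b"], delta_hat(shifted, "a", "b"))
```

The invariance under a constant offset is the reason central moments are unaffected by the deterministic transmission delay.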



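Lemma 2 Suppose $X = (X_1, \ldots, X_k)$ are independent random variables taking values in $[-B, B]$ with
$k \ge 2$. Then, the variance estimator

$$\hat\sigma^2_X = \frac{1}{k - 1} \sum_{i=1}^{k} (X_i - \bar X)^2 = \frac{1}{k(k - 1)} \sum_{i < j} (X_i - X_j)^2,$$

where $\bar X$ is the sample average, is $\frac{4B^2}{k}$-Lipschitz.

Proof: Let $X$ be as above and let $Y$ differ from $X$ in one coordinate, say the $i$-th. Then

$$\left| \hat\sigma^2_X - \hat\sigma^2_Y \right| \le \frac{1}{k(k - 1)} \sum_{j \ne i} \left| (X_i - X_j)^2 - (Y_i - X_j)^2 \right| \le \frac{4B^2}{k}.$$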
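The Lipschitz property can be sanity-checked numerically. The sketch below assumes the constant is $4B^2/k$, which is what the pairwise-difference form of the estimator gives: changing one coordinate affects $k - 1$ squared differences, each by at most $4B^2$. It also checks that the centered and pairwise forms of the estimator agree. The parameters and function names are assumptions made for illustration.

```python
import random

def var_centered(xs):
    """Unbiased variance estimator written with the sample average."""
    k = len(xs)
    mean = sum(xs) / k
    return sum((x - mean) ** 2 for x in xs) / (k - 1)

def var_pairwise(xs):
    """The same estimator written as a sum over unordered pairs i < j."""
    k = len(xs)
    total = sum((xs[i] - xs[j]) ** 2 for i in range(k) for j in range(i + 1, k))
    return total / (k * (k - 1))

def worst_single_change(B, k, trials=2000, seed=0):
    """Largest observed change of the estimator when one coordinate of a
    random point of [-B, B]^k is moved to an endpoint of the interval."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        xs = [rng.uniform(-B, B) for _ in range(k)]
        ys = list(xs)
        ys[rng.randrange(k)] = rng.choice([-B, B])
        worst = max(worst, abs(var_pairwise(xs) - var_pairwise(ys)))
    return worst

B, k = 3.0, 10
worst = worst_single_change(B, k)
print(worst, 4 * B ** 2 / k)
```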



We then get immediately the following.

Lemma 3 (Lipschitz Constant for Delay-based Metric) Say $\hat\delta^{(2)}_{ab}$ is computed with $k$ samples.
Then, $\hat\delta^{(2)}_{ab}$ is $\frac{4 |P_{ab}|^2 M^2}{k}$-Lipschitz.



3.2 Inferring the routing tree 

Equipped with a legitimate tree metric, we use the DMR algorithm to infer the topology. Here, we 
use Theorem [2] to prove that the routing tree can be inferred with poly log n samples at the leaves. 
This is our main result for this section. The main technical difficulty (unlike the phylogenetic case) is 
in controlling the deviation of "long distances." (See second part of the proof.) Fix $\alpha$, $\beta$, $f$, $g$ as in
Theorem 2. Note that by assumption we have

$$f \le w_e^{(2)} \le g, \qquad (2)$$

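for all $e$, with $g$ a constant depending only on $M$ (a delay taking values in $[0, M]$ has variance at most $M^2$).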



Theorem 3 (Efficient Network Inference) Let $T = (V, E)$ be the (unknown) routing tree where
edge delays satisfy Assumptions 1 and 2. Consider the tree metric $W^{(2)} = \delta^{(2)}$ and assume that the
estimate $\hat W^{(2)} = \hat\delta^{(2)}$ is computed using $k$ samples at the leaves. Then, DMR returns the correct
topology for $T$ with probability $1 - o(1)$ if $k = \Omega(\log^5 n)$ (where the constant factor depends only on
$f, g$), as $n$ tends to $+\infty$.

Proof: Assume $k$ is as stated above. We apply Theorem 2 and therefore only need to show that $\hat W^{(2)}$
is a $(\alpha f, \beta g \Delta(T))$-distortion of $W^{(2)}$ when $k = \Omega(\log^5 n)$.

Part 1. First, we must show that distances smaller than $\beta g \Delta(T) + \alpha f$ under $W^{(2)}$ are approximated
within $\alpha f$. For reasons that will become clear below, we show instead that distances smaller than twice
that amount are well approximated. Let $a, b$ be a pair of leaves at distance at most $2\beta g \Delta(T) + 2\alpha f$.
Let $A$ be the probability that, for all such pairs, $W^{(2)}$ is approximated within $\alpha f$. By our assumption
(2), the number of edges on the path between $a$ and $b$ is at most

$$|P_{ab}| \le (2\beta \Delta(T) + 1) \frac{g}{f},$$

where we used that $f \le g$ and $\alpha < 1/6$. By Lemmas 1 and 3, we have

$$P\left[ \left| \hat\delta^{(2)}_{ab} - \delta^{(2)}_{ab} \right| \ge \alpha f \right] \le 2 \exp\left( -\frac{(\alpha f)^2 k}{2 \left[ 4 (2\beta \Delta(T) + 1)^2 \frac{g^2}{f^2} M^2 \right]^2} \right) \le \frac{1}{\mathrm{poly}(n)},$$

from $\Delta(T) = O(\log n)$, $k = \Omega(\log^5 n)$, and the fact that $f, g, M$ are constants. The notation $\mathrm{poly}(n)$
means $O(n^K)$ for a $K$ as large as we need as long as the constant factor in $k$ is large enough. Since
there are at most $n^2$ such pairs of leaves, we get $1 - A \le 1/\mathrm{poly}(n)$.



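Part 2. Let $a, b$ be a pair of leaves at distance at least $2\beta g \Delta(T) + 2\alpha f$ under $W^{(2)}$. We now show
that, for all such pairs, $\hat W^{(2)}$ is at least $\beta g \Delta(T) + \alpha f$. Let $B$ be the probability of that event. Note
first that from Azuma-Hoeffding (Lemma 1), it follows that for any pair of leaves $a, b$,

$$P\left[ \left| (D_a - D_b) - E[D_a - D_b] \right| \le \sqrt{|P_{ab}| \, O(\log n)} \right] \ge 1 - \frac{1}{\mathrm{poly}(n)}. \qquad (3)$$

Let $\mathcal{E}$ be the event that the inequality in square brackets in (3) holds for all $k$ samples used to compute
$\hat\delta^{(2)}_{ab}$. Then from Lemma 2, on $\mathcal{E}$, the Lipschitz constant of $\hat\delta^{(2)}_{ab}$ (as a function of the centered samples
$(D_a^i - D_b^i) - E[D_a - D_b]$) is $t = |P_{ab}| \, O(\log n) / k$ and therefore, by Lemma 1 again,

$$P\left[ \hat\delta^{(2)}_{ab} < \beta g \Delta(T) + \alpha f \,\middle|\, \mathcal{E} \right] \le P\left[ \hat\delta^{(2)}_{ab} \le \tfrac{1}{2} \delta^{(2)}_{ab} \,\middle|\, \mathcal{E} \right] \le P\left[ \hat\delta^{(2)}_{ab} \le E[\hat\delta^{(2)}_{ab} \mid \mathcal{E}] - \left( \tfrac{1}{2} - o(1) \right) \delta^{(2)}_{ab} \,\middle|\, \mathcal{E} \right]$$

$$\le \exp\left( -\frac{\left( (\tfrac{1}{2} - o(1)) \, \delta^{(2)}_{ab} \right)^2}{2 t^2 k} \right) \le \exp\left( -\frac{k}{O(\log^2 n)} \right) \le \frac{1}{\mathrm{poly}(n)},$$

where we used $\delta^{(2)}_{ab} = \Theta(|P_{ab}|)$ and

$$(1 - o(1)) \, \delta^{(2)}_{ab} \le E[\hat\delta^{(2)}_{ab} \mid \mathcal{E}] \le (1 + o(1)) \, \delta^{(2)}_{ab},$$

which follows from $E[\hat\delta^{(2)}_{ab} \mid \mathcal{E}] \, P[\mathcal{E}] + E[\hat\delta^{(2)}_{ab} \mid \mathcal{E}^c] \, P[\mathcal{E}^c] = E[\hat\delta^{(2)}_{ab}]$, $E[\hat\delta^{(2)}_{ab}] = \delta^{(2)}_{ab}$, $P[\mathcal{E}^c] \le 1/\mathrm{poly}(n)$, and
$\hat\delta^{(2)}_{ab} = O(n^2)$. Therefore, we have $1 - B \le 1/\mathrm{poly}(n)$.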



Combining the two parts of the argument, we have shown that, except with $o(1)$ probability, $\hat W^{(2)}$ is
a $(\alpha f, \beta g \Delta(T))$-distortion of $W^{(2)}$: indeed, by Part 2 the pairs of leaves for which $\hat W^{(2)} < \beta g \Delta(T) + \alpha f$
must have $W^{(2)} < 2\beta g \Delta(T) + 2\alpha f$ and such pairs satisfy the approximation guarantee required by the
definition of a distorted metric by Part 1. Moreover, Part 1 implies in particular that pairs of leaves
such that $W^{(2)} < \beta g \Delta(T) + \alpha f$ also satisfy the approximation guarantee. ■

4 Edge Delay Inference 

In this section, we show how to estimate the characteristic moments of edge delays. In Section 3, we
showed how to reconstruct the topology efficiently with high probability (see Theorem 3). Therefore,
along with Assumptions 1 and 2, we make the following assumption.

Assumption 3 (Correct Reconstruction of Routing Tree) We assume that the routing tree was
correctly estimated. (This is true with high probability by Theorem 3.)

Our general idea to recover delay distributions is to define so-called "additive functions" whose 
edge weights are moments of delays. Then we use the AFI algorithm below to recover the moments 
efficiently from the data at the leaves. As it turns out, even moments are rather straightforward to 
estimate inductively while odd moments are trickier. Also, as in the tree reconstruction algorithm (see 
also OUT]), the AFI algorithm uses only "short" paths during the estimation process, which allows a 
significant reduction in the sample size (see Propositions 1, 2 and Theorems 4, 5 for details).

4.1 Additive Functions 

In the remainder of this paper, we use additive metric-type ideas to estimate moments of edge delays. 
For this purpose, we need to recover edge weights from appropriately defined tree metrics. In fact, 
we use a notion of "generalized" tree metric which is useful in treating odd moments. This definition 
allows for negative edge weights. 

Definition 5 (Additive function) A function on the leaf set of the tree $W : L \times L \to \mathbb{R}$ is called
an additive function on the leaves if there exist weights $w_e \in \mathbb{R}$ on each of the edges (not necessarily
positive), such that for all leaves a, b

$$W(a,b) = \sum_{e \in P_{ab}} w_e.$$

Suppose we are given access to an additive function W on the leaves. Our goal is now to recover
the $w_e$'s from the function W, assuming further that we are given the tree T. For this purpose, we use
a standard algorithm from combinatorial phylogenetics — related to the so-called Four-Point Method of 
Buneman [3j (see also [HI [25]). We will refer to this algorithm as the Additive Function Inference 
(AFI) algorithm. See Figures 2 and 3.



Algorithm Additive Function Inference
Input: tree T, function W at the leaves;
Output: edge weights $w_e$, for all $e \in E$;

• For all internal edges e,

  - Let $S_1, \ldots, S_4$ be the four subtrees hanging from e as in Figure 3;

  - For each $S_i$, compute $u_i$, the closest (in graph distance) leaf to the root $r_i$ of $S_i$;

  - Compute

    $$w_e = \frac{1}{2}\left(W(u_1,u_3) + W(u_2,u_4) - W(u_1,u_2) - W(u_3,u_4)\right).$$

• For all leaf edges e,

  - Let e = (a, v) with a a leaf,

  - Proceed as above where $u_3$ and $u_4$ are set to a.

Figure 2: Algorithm Additive Function Inference.
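
The AFI procedure of Figure 2 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the tree is assumed to be given as an adjacency dictionary `adj`, the additive function as a callable `W(a, b)` with `W(a, a) = 0`, and the helper names `nearest_leaf`, `afi` and `path_sum` are hypothetical. The four-point formula is applied with one endpoint's representatives playing the role of $(u_1, u_2)$ and the other's the role of $(u_3, u_4)$.

```python
from collections import deque


def nearest_leaf(adj, start, blocked):
    """Breadth-first search from `start` that never steps onto `blocked`;
    returns the first leaf (degree-1 node) reached."""
    seen = {start, blocked}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if len(adj[node]) == 1:
            return node
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    raise ValueError("no leaf reachable")


def afi(adj, W):
    """Recover one weight per edge from an additive function W on the leaves."""
    weights = {}
    for x in adj:
        for y in adj[x]:
            key = frozenset((x, y))
            if key in weights:
                continue  # each undirected edge is handled once
            reps = []
            for end, other in ((x, y), (y, x)):
                if len(adj[end]) == 1:
                    reps.append((end, end))  # leaf edge: both representatives are the leaf
                else:
                    subs = [n for n in adj[end] if n != other][:2]
                    reps.append(tuple(nearest_leaf(adj, s, end) for s in subs))
            (u1, u2), (u3, u4) = reps
            weights[key] = 0.5 * (W(u1, u3) + W(u2, u4) - W(u1, u2) - W(u3, u4))
    return weights


def path_sum(adj, w, a, b):
    """Sum of edge weights on the unique tree path from a to b (used to build a test W)."""
    stack = [(a, None, 0.0)]
    while stack:
        node, parent, total = stack.pop()
        if node == b:
            return total
        for nxt in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, total + w[frozenset((node, nxt))]))
    return None


# Hypothetical four-leaf tree with one negative edge weight.
adj = {"a": ["u"], "b": ["u"], "c": ["v"], "d": ["v"],
       "u": ["a", "b", "v"], "v": ["c", "d", "u"]}
true_w = {frozenset(("a", "u")): 1.0, frozenset(("b", "u")): 2.0,
          frozenset(("u", "v")): -0.5, frozenset(("c", "v")): 0.25,
          frozenset(("d", "v")): 3.0}
estimated = afi(adj, lambda p, q: path_sum(adj, true_w, p, q))
```

Because each edge is recovered from leaves that are close to it, only values of W on nearby pairs of leaves are queried, which is the property the sample-size analysis relies on.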

4.2 Delay-based metrics 

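Figure 3: Edge weight inference.

Let T = (V, E) be the routing tree with leaf set L and consider again the choice of weights

$$w_e^{(2)} = \mathrm{Var}[d_e],$$

for all $e \in E$ and

$$W^{(2)}(a,b) = \sum_{e \in P_{ab}} w_e^{(2)},$$

for all $a, b \in L$. Recall that

$$\delta^{(2)}_{ab} = \mathrm{Var}[D_a - D_b] = \mathrm{Var}\Big[\sum_{e \in P_{a\gamma_{ab}}} d_e - \sum_{e \in P_{\gamma_{ab} b}} d_e\Big] = \sum_{e \in P_{a\gamma_{ab}}} \mathrm{Var}[d_e] + \sum_{e \in P_{\gamma_{ab} b}} \mathrm{Var}[d_e] = W^{(2)}(a,b).$$

Therefore, using the AFI algorithm, we can recover estimates of the $w_e^{(2)}$'s from the $\hat\delta^{(2)}_{ab}$'s.
More generally, we let

$$w_e^{(j)} = \mathbb{E}\big[\bar d_e^{\,j}\big],$$

for all $e \in E$ where

$$\bar d_e = d_e - \mathbb{E}[d_e].$$

Also, let

$$W^{(j)}(a,b) = \sum_{e \in P_{ab}} w_e^{(j)},$$

for all $a, b \in L$. Let

$$\bar D_a = D_a - \mathbb{E}[D_a],$$

for all $a \in L$. Again, to obtain $W^{(j)}(a,b)$, we seek to use the quantity

$$\delta^{(j)}_{ab} = \mathbb{E}\big[(\bar D_a - \bar D_b)^j\big]$$

for j > 1, which can be estimated from the samples using

$$\hat\delta^{(j)}_{ab} = \frac{1}{k} \sum_{i=1}^{k} \big(D_a^i - D_b^i - \hat\delta^{(1)}_{ab}\big)^j,$$

where

$$\hat\delta^{(1)}_{ab} = \frac{1}{k} \sum_{i=1}^{k} \big(D_a^i - D_b^i\big).$$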



As Lemma 4 below shows, this can be done inductively. However, the lemma also shows that odd
moments have to be treated more carefully.
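
As an illustration of the estimator $\hat\delta^{(j)}_{ab}$, the following minimal Python sketch computes the empirical central moments of the delay differences for one pair of leaves. The function name `delta_hat` and the sample lists are hypothetical; the lists are assumed to hold the k end-to-end delay samples $D_a^i$ and $D_b^i$ in matching order.

```python
def delta_hat(samples_a, samples_b, j):
    """Plug-in estimate of the j-th central moment of D_a - D_b."""
    k = len(samples_a)
    diffs = [x - y for x, y in zip(samples_a, samples_b)]
    mean = sum(diffs) / k          # this is the first-moment estimate
    if j == 1:
        return mean
    return sum((d - mean) ** j for d in diffs) / k


# Hypothetical samples: four probes observed at leaves a and b.
d_a = [1.0, 2.0, 3.0, 4.0]
d_b = [0.0, 0.0, 0.0, 0.0]
second = delta_hat(d_a, d_b, 2)
third = delta_hat(d_a, d_b, 3)
```

For j = 2 this is the (biased) sample variance of the differences, which is the input to the AFI algorithm in the variance case.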

4.3 Algorithm for Moment Inference 

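We first need the following definitions. Let a, b be leaves and $j \in \mathbb{N}$. We use the notation [h] =
{0, . . . , h} for $h \in \mathbb{N}$. Recall that $\gamma_{ab}$ is the most recent common ancestor of a and b in the tree.
Denote $\nu = |P_{ab}|$, $\alpha = |P_{a\gamma_{ab}}|$, and $\beta = |P_{\gamma_{ab} b}|$, and define

$$\mathcal{D}_j(a,b) = \Big\{ (x,y) \in [j-1]^{\alpha} \times [j-1]^{\beta} : \sum_{i=1}^{\alpha} x_i + \sum_{i=1}^{\beta} y_i = j \Big\}.$$

For $(x,y) \in \mathcal{D}_j(a,b)$, let

$$\binom{j}{x,y} = \frac{j!}{\prod_{i=1}^{\alpha} x_i! \, \prod_{i=1}^{\beta} y_i!},$$

and consider the function

$$\mathcal{F}_j(a,b) = \sum_{(x,y) \in \mathcal{D}_j(a,b)} \binom{j}{x,y} \prod_{i=1}^{\alpha} w_{e_i}^{(x_i)} \prod_{i=1}^{\beta} (-1)^{y_i} w_{f_i}^{(y_i)},$$

where $P_{a\gamma_{ab}} = (e_1, \ldots, e_\alpha)$ and $P_{\gamma_{ab} b} = (f_1, \ldots, f_\beta)$.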

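Lemma 4 Let $j \in \mathbb{N}$ and define the function $\mathcal{F}_j : L \times L \to \mathbb{R}$ as above. Then,
1. we have for all $a, b \in L$

$$\delta^{(j)}_{ab} - \mathcal{F}_j(a,b) = \sum_{e \in P_{a\gamma_{ab}}} w_e^{(j)} + (-1)^j \sum_{e \in P_{\gamma_{ab} b}} w_e^{(j)}, \qquad (4)$$

2. in particular, if j is even, we have for all $a, b \in L$

$$\delta^{(j)}_{ab} - \mathcal{F}_j(a,b) = W^{(j)}(a,b). \qquad (5)$$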

Proof: This follows immediately from a multinomial expansion. ■ 

The important point to note in (5) is that $\mathcal{F}_j(a,b)$ depends only on delay moments of order strictly
less than j and that $\delta^{(j)}_{ab}$ can be estimated from samples at the leaves. Therefore, if j is even and if
we have estimates of all edge delay moments of order up to j − 1, we can estimate $W^{(j)}(a,b)$ by (5).
Using the AFI algorithm, we can then get an estimate of the j-th moments $w_e^{(j)}$. However, if j is odd,
the coefficient $(-1)^j$ in (4) precludes the use of this procedure. Lemma 5 below shows how to handle
this case. We note in passing that Lemma 4 above is sufficient for delay distributions symmetric about
their mean. Indeed, in that case, all odd central moments are zero and one can use (5) recursively to
estimate all even characteristic moments. See Figure 4.
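
To make the multinomial correction term concrete, the sketch below evaluates $\mathcal{F}_j(a,b)$ by brute-force enumeration of the exponent vectors in $[j-1]^{\alpha} \times [j-1]^{\beta}$ that sum to j. It is an illustrative sketch only: the function name `f_j` and the input layout are assumptions, with `moments_a[i][m]` holding the m-th central moment of the i-th edge on the a-side (so entry 0 is 1 and entry 1 is 0), and likewise for `moments_b`. The enumeration is exponential in the path length and is meant for small examples.

```python
from itertools import product
from math import factorial, prod


def f_j(j, moments_a, moments_b):
    """Sum over exponent vectors with entries in {0, ..., j-1} adding up to j."""
    n_a = len(moments_a)
    edges = n_a + len(moments_b)
    total = 0.0
    for exps in product(range(j), repeat=edges):
        if sum(exps) != j:
            continue
        # multinomial coefficient j! / (x_1! ... y_beta!)
        term = float(factorial(j) // prod(factorial(m) for m in exps))
        for idx, m in enumerate(exps):
            if idx < n_a:
                term *= moments_a[idx][m]
            else:
                term *= (-1) ** m * moments_b[idx - n_a][m]
        total += term
    return total


# Hypothetical edge delays: centered +1/-1 coin flips, whose central moments
# of order 0..4 are 1, 0, 1, 0, 1.
coin = [1.0, 0.0, 1.0, 0.0, 1.0]
correction_two_edges = f_j(4, [coin], [coin])
correction_three_edges = f_j(4, [coin, coin], [coin])
```

For one coin-flip edge on each side, the fourth central moment of the difference is 8 and the correction is 6, leaving 2, the sum of the two edge fourth moments, in agreement with (5).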

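We now tackle odd moments. A proper estimation procedure follows from the next lemma. We
first need a few definitions. For $a, b \in L$, and $1 \le i^* \le \alpha$, we let

$$\mathcal{E}^{(1)}_j(a,b;i^*) = \Big\{ (x,y) \in [j-1]^{\alpha} \times [j-1]^{\beta} : \sum_{i=1}^{\alpha} x_i + \sum_{i=1}^{\beta} y_i = j, \ x_{i^*} \ge 1 \Big\}$$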
Algorithm Symmetric Edge Reconstruction

Input: data $\{D_a^1\}_{a \in L}, \ldots, \{D_a^k\}_{a \in L}$ at the leaves; topology T;

Output: estimated characteristic (even) moments $\hat w_e^{(j)}$ for all $e \in E$ and $2 \le j \le J$ even;

• Initialization: set all estimates of odd moments to 0;

• Main Loop: For all $2 \le j \le J$ even,

  - For all $a, b \in L$,

    * Estimate $\hat\delta^{(j)}_{ab}$;

    * Estimate $\mathcal{F}_j(a,b)$ with

      $$\hat{\mathcal{F}}_j(a,b) = \sum_{(x,y) \in \mathcal{D}_j(a,b)} \binom{j}{x,y} \prod_{i=1}^{\alpha} \hat w_{e_i}^{(x_i)} \prod_{i=1}^{\beta} (-1)^{y_i} \hat w_{f_i}^{(y_i)};$$

    * Compute

      $$\hat W^{(j)}(a,b) = \hat\delta^{(j)}_{ab} - \hat{\mathcal{F}}_j(a,b);$$

  - Use the AFI algorithm on $\hat W^{(j)}(a,b)$ to recover all $\hat w_e^{(j)}$'s.

Figure 4: Algorithm Symmetric Edge Reconstruction.



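and

$$\mathcal{G}^{(1)}_j(a,b) = \sum_{i^*=1}^{\alpha} \ \sum_{(x,y) \in \mathcal{E}^{(1)}_j(a,b;i^*)} \frac{x_{i^*}}{j} \binom{j}{x,y} \prod_{i=1}^{\alpha} w_{e_i}^{(x_i)} \prod_{i=1}^{\beta} (-1)^{y_i} w_{f_i}^{(y_i)},$$

where we use the notations of Lemma 4. Similarly, for $1 \le i^* \le \beta$, we define $\mathcal{E}^{(2)}_j(a,b;i^*)$ and $\mathcal{G}^{(2)}_j(a,b)$
by interchanging the roles of x and y. Our next definition requires a few combinatorial notions. Recall
the definition of quartet split given earlier. Let a, b, c be any leaves in a rooted tree T with root $\rho$
(which is also a leaf). We write ab|c if $ab|c\rho$ holds in T. Then, for all leaves $a, b, c \ne \rho$ with ab|c, let

$$\phi^{(j)}_{ab|c} = \mathbb{E}\big[(\bar D_a - \bar D_b)^{j-1} (\bar D_a + \bar D_b - 2\bar D_c)\big].$$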

Lemma 5 Let $j \in \mathbb{N}$. Then, using the notations above, we have for all $a, b, c \in L$

$$W^{(j)}(a,b) = \phi^{(j)}_{ab|c} - \big[\mathcal{G}^{(1)}_j(a,b) + \mathcal{G}^{(2)}_j(a,b)\big]. \qquad (6)$$



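Proof: We write

$$\mathbb{E}\big[(\bar D_a - \bar D_b)^{j-1} (\bar D_a + \bar D_b - 2\bar D_c)\big] = \mathbb{E}\big[(\bar D_a - \bar D_b)^{j-1} (\bar D_a - \bar D_c)\big] + \mathbb{E}\big[(\bar D_a - \bar D_b)^{j-1} (\bar D_b - \bar D_c)\big].$$

Let (as in Figure 5)

$$H_1 = \sum_{e \in P_{a\gamma_{ab}}} \bar d_e, \qquad H_2 = \sum_{e \in P_{b\gamma_{ab}}} \bar d_e, \qquad H_3 = \sum_{e \in P_{\gamma_{ac}\gamma_{ab}}} \bar d_e, \qquad H_4 = \sum_{e \in P_{c\gamma_{ac}}} \bar d_e.$$

Figure 5: The $H_i$'s are centered sums of delays on the corresponding paths.

Note that all these random variables are independent and have mean 0. Then

$$\mathbb{E}\big[(\bar D_a - \bar D_b)^{j-1} (\bar D_a - \bar D_c)\big] = \mathbb{E}\big[(H_1 - H_2)^{j-1} (H_1 + H_3 - H_4)\big] = \mathbb{E}\big[(H_1 - H_2)^{j-1} H_1\big] = \sum_{e \in P_{a\gamma_{ab}}} w_e^{(j)} + \mathcal{G}^{(1)}_j(a,b).$$

Similarly,

$$\mathbb{E}\big[(\bar D_a - \bar D_b)^{j-1} (\bar D_b - \bar D_c)\big] = \sum_{e \in P_{b\gamma_{ab}}} w_e^{(j)} + \mathcal{G}^{(2)}_j(a,b).$$

The result follows.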



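Again, the key point in (6) is that $\mathcal{G}^{(1)}_j(a,b)$ and $\mathcal{G}^{(2)}_j(a,b)$ depend only on moments of order strictly
less than j and that $\phi^{(j)}_{ab|c}$ can be estimated from samples at the leaves. The algorithm for the general
case is detailed in Figure 6. We use the plug-in estimator for $\phi^{(j)}_{ab|c}$,

$$\hat\phi^{(j)}_{ab|c} = \frac{1}{k} \sum_{i=1}^{k} \big(D_a^i - D_b^i - \hat\delta^{(1)}_{ab}\big)^{j-1} \Big[ \big(D_a^i - D_c^i - \hat\delta^{(1)}_{ac}\big) + \big(D_b^i - D_c^i - \hat\delta^{(1)}_{bc}\big) \Big].$$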
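
The plug-in estimator for $\phi^{(j)}_{ab|c}$ translates directly into code. The sketch below is illustrative only; the function name `phi_hat` and the three sample lists (delays observed at leaves a, b and c for the same k probes, in matching order) are assumptions.

```python
def phi_hat(d_a, d_b, d_c, j):
    """Plug-in estimate built from centered pairwise delay differences."""
    k = len(d_a)
    mean_ab = sum(x - y for x, y in zip(d_a, d_b)) / k
    mean_ac = sum(x - z for x, z in zip(d_a, d_c)) / k
    mean_bc = sum(y - z for y, z in zip(d_b, d_c)) / k
    total = 0.0
    for x, y, z in zip(d_a, d_b, d_c):
        first = (x - y - mean_ab) ** (j - 1)
        second = (x - z - mean_ac) + (y - z - mean_bc)
        total += first * second
    return total / k


# Hypothetical samples: three probes, with all variation on the path to a.
obs_a = [1.0, 2.0, 3.0]
obs_b = [0.0, 0.0, 0.0]
obs_c = [0.0, 0.0, 0.0]
phi_2 = phi_hat(obs_a, obs_b, obs_c, 2)
phi_3 = phi_hat(obs_a, obs_b, obs_c, 3)
```

Note that the estimator needs joint samples at triples of leaves, in contrast to the pairwise differences used in the symmetric case.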



5 Analysis of the ER Algorithm 






We start with the analysis of the symmetric case. 

We begin with a concentration result for the estimate $\hat\delta^{(j)}_{ab}$. For convenience, we assume $M \ge 1$.
(This can always be obtained by rescaling.) Recall the definition of the depth of T from Section 2 and
remember that $\Delta(T) = O(\log n)$ if the degree of all internal nodes is at least 3. The dependence of our
bounds on the depth of the routing tree explains the importance of using short paths in the estimation 
procedures. 
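Algorithm Edge Reconstruction

Input: data $\{D_a^1\}_{a \in L}, \ldots, \{D_a^k\}_{a \in L}$ at the leaves; topology T;

Output: estimated characteristic moments $\hat w_e^{(j)}$ for all $e \in E$ and $2 \le j \le J$;

• Initialization: set all estimates of first moments to 0;

• Main Loop: For all $2 \le j \le J$,

  - For all $a, b \in L$,

    * Pick the closest leaf c above $\gamma_{ab}$;

    * Compute $\hat\phi^{(j)}_{ab|c}$, the plug-in estimator for $\phi^{(j)}_{ab|c}$;

    * Estimate $\mathcal{G}^{(1)}_j(a,b)$ with

      $$\hat{\mathcal{G}}^{(1)}_j(a,b) = \sum_{i^*=1}^{\alpha} \ \sum_{(x,y) \in \mathcal{E}^{(1)}_j(a,b;i^*)} \frac{x_{i^*}}{j} \binom{j}{x,y} \prod_{i=1}^{\alpha} \hat w_{e_i}^{(x_i)} \prod_{i=1}^{\beta} (-1)^{y_i} \hat w_{f_i}^{(y_i)},$$

      and similarly for $\mathcal{G}^{(2)}_j(a,b)$;

    * Compute

      $$\hat W^{(j)}(a,b) = \hat\phi^{(j)}_{ab|c} - \big(\hat{\mathcal{G}}^{(1)}_j(a,b) + \hat{\mathcal{G}}^{(2)}_j(a,b)\big);$$

  - Use the AFI algorithm on $\hat W^{(j)}(a,b)$ to recover all $\hat w_e^{(j)}$'s.

Figure 6: Algorithm Edge Reconstruction.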



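Proposition 1 Let $a, b \in L$ be at graph distance less than $2\Delta$ where $\Delta = \Delta(T)$ is the chord depth of T.
Fix $j \in \mathbb{N}$. We have the following (where the constants depend on J and M only):

1. There exists a constant C such that, $\forall \lambda > 0$,

$$\mathbb{P}\Big[ \big| \hat\delta^{(j)}_{ab} - \mathbb{E}\big[\hat\delta^{(j)}_{ab}\big] \big| > \lambda \Big] \le 2 \exp\left( -\frac{\lambda^2 k}{C \Delta^{2j-1}} \right). \qquad (7)$$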



2. There exists a constant C such that 



E 



"aft "ah 



< C 



, A^- 



A:i/2' 



whered'^^ =E[Da-Db]. 
3. There exists a constant C" such that, if k > A^, 



E 



"ab 



r(j) 



<c" 



,m2jaj+i 



Vk 



(8) 



(9) 



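4. If further the right-hand side of (9) is less than $\lambda$, then we have

$$\mathbb{P}\Big[ \big| \hat\delta^{(j)}_{ab} - \delta^{(j)}_{ab} \big| > 2\lambda \Big] \le 2 \exp\left( -\frac{\lambda^2 k}{C \Delta^{2j-1}} \right). \qquad (10)$$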
Proof: 1. We use Azuma's inequality (see Lemma 1). Let

where $D_u^i$ is the i-th delay sample at node u. Because $|P_{ab}| \le 2\Delta$ and $d_e \in [0, M]$ for all e, it follows
that

$$|K_i| \le 4M\Delta.$$

Then let 

and let C be the same quantity when an arbitrary dl is perturbed by 5 with \5\ < M (where dl is 
the i-th delay sample on edge e). Without loss of generality, assume the perturbation is in the first 
sample. Then, 



(11) 



Now expanding (fTTj) . we get 

\C -C'\ < ^ (2^{4MAy~^M + ik-l) (2^{4MAy~^^^^ 

for some constant C depending on M, J. Noting that C depends on at most 2Afc random variables d^, 
we get the result by an application of Azuma's inequality (for a different C). 
2. Note that 

-^ ~ "ab "ab ' 

is a — ^ Lipschitz function of {-D^ — D^jjgrfci thus we have by Azuma's inequality 



*ii'-ii;' 



> A 



-^'"'"''[-SM^) 



Now we use the fact that for a positive random variable Y,

$$\mathbb{E}[Y^j] = j \int_0^{\infty} \lambda^{j-1} \, \mathbb{P}(Y > \lambda) \, d\lambda.$$

If y = |,5« - 5« I and -0 = 8MW' ^e have 

. /.+00 /o)i.r2 a2\ j72 

That proves 2 (for a different C"). 
3. We have 

k . ^ k 



s = iE((oi-o»-iii')'45:((«;-"s-*i;vwi'-^ii'^^' 



j=i i=i 
Now expand using the binomial theorem and take expectations to get



E 



i'£ 



r{i) 



< -E 
k 



k i-1 

EE 



j^l \^ci ^b "ab j l^afe "ab ) 



< CUM Ay max <^ E 
o</i<i-i ' 



"afe "ab 



j-h 



Note that by fc > A^ , it follows that the maximum is attained at /i = j — 1 in (l8|) . 
4. This follows from 1. and 3. ■ 

We then get the main theorem in the symmetric case. Recall that J = O(1) and that, in general,
$\Delta = O(\log n)$ where n is the number of leaves.

Theorem 4 Let e > be arbitrarily small. If k = u;(A^'^ logn), then after an application of SymER, 
one has 



m - <5(i) 



<£, yee E, VI < j < J 



> l-o(ll 



(12) 



as n ^ +00. The algorithm runs in time 0{A n ). 



Proof: Let $(a,b) \in L \times L$ be called a short pair if a, b are at graph distance at most $2\Delta$. Denote by S
the set of all short pairs. Let



ia,b)eS 



W'-^\a,b)-W^^\a,b) 



and 



max (Tj. 



It follows immediately from the application of the AFI algorithm that 



max 

eG-E 



-(j) (j) 



< 2a,. 



Therefore, it suffices to prove 



Sj = o(l), 



with high probability as n tends to +oo. 

Further, assume we have a uniform bound 



max max 
^<j<J {a,b)eS 



«ii'-^ii' 



<T* 



Recall that 
where 



m^\a,b) = 6f,^ - Tj{a,b) 



(x,y)G©,(a,6) ^ ' 



Vi^iVi) 



i=l i=l 



Note that J-j{a,b) has at most A-' terms (including the multinomial factor). Therefore, since the 

function 

J 

h{^)=l[. 



X 



v 



j=l 
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is continuously differentiable with bounded derivatives in [—M,M], there is C (depending on M, J) 
such that 

aj <T* + CAJ(2Sj_i), 

for small Sj_i. Then we have 

for some $C^* > 0$ depending on J, M, where we used $\sigma_2 \le \tau^*$.

So it suffices to have r* = {ujn^ /2)-i where cOn — > +00 as n ^ +00 arbitrarily slowly. By the 
last part of Proposition [H using a union bound over the O(n^) short pairs of leaves, it follows that 
k = CiOnl^'^ logn samples are enough to guarantee 



i-i-^'ll 



< (u;„,A-^'/^)"\ VI < j < J, V short pairs a,b\ > 1 - o(l), 



for some C depending on J, M. 

As for the computational complexity of the algorithm, assume first that the tree is represented in
such a way that finding the set of edges on the path between two leaves a, b at distance $O(\Delta)$ takes
time $O(\Delta)$ (this is easy in a rooted tree). Note that for each j, a, b the sum



$$\hat{\mathcal{F}}_j(a,b) = \sum_{(x,y) \in \mathcal{D}_j(a,b)} \binom{j}{x,y} \prod_{i=1}^{\alpha} \hat w_{e_i}^{(x_i)} \prod_{i=1}^{\beta} (-1)^{y_i} \hat w_{f_i}^{(y_i)}$$



can be computed in time A . Since there are 0{n ) pairs of leaves, the total complexity is 0{A n 



J^2\ 



Similarly, in the general case, we get: 

Proposition 2 Let $a, b, c \in L$ be at graph distance less than $2\Delta$ where $\Delta = \Delta(T)$ is the depth of T. Fix
$j \in \mathbb{N}$. We have the following (where the constants depend on J and M only):



1. There exists a constant C such that, VA > 0, 



ab\c 



E 



ab\c 



> A ) < 2 exp 



X^k 



CA^J' 



(13) 



2. There exists a constant C such that, if k > A'^, 



E 



^ab\c 



able 



<c 



AM Ay 



Vk 



(14) 



3. If further 



then we have 



^ab\c ^ab\c 



C'<^ < A, 
> 2a) < 2 exp 



\^k 



CA2i- 



(15) 



Proof Sketch: The proof is very similar to that of Proposition 1. We only give a sketch.

To prove 1., it is enough to consider four separate cases depending on the path segment (corresponding
to $H_1$, $H_2$, $H_3$ and $H_4$ in Figure 5) in which we make the perturbation.

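To prove 2., note that we can write

$$\hat\phi^{(j)}_{ab|c} = \frac{1}{k} \sum_{i=1}^{k} (X_i + \epsilon_I)^{j-1} (Y_i + \epsilon_{II}), \qquad (16)$$

with $X_i = (D_a^i - D_b^i) - \delta^{(1)}_{ab}$, $Y_i = (D_a^i - D_c^i - \delta^{(1)}_{ac}) + (D_b^i - D_c^i - \delta^{(1)}_{bc})$, and

$$\epsilon_I = \delta^{(1)}_{ab} - \hat\delta^{(1)}_{ab}, \qquad \epsilon_{II} = \big(\delta^{(1)}_{ac} - \hat\delta^{(1)}_{ac}\big) + \big(\delta^{(1)}_{bc} - \hat\delta^{(1)}_{bc}\big).$$

Also note that $\phi^{(j)}_{ab|c} = \mathbb{E}[X_i^{j-1} Y_i]$, $\forall i$. Use the binomial theorem to expand the expression in (16) and
write it as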

where the error term is 

Now use the fact that $|X_i| \le 4M\Delta$, $|Y_i| \le 8M\Delta$, and Part 2 of Proposition 1 to conclude that

E[|K|1 < C'(^^ 

Part 3 now follows by combining Parts 1 and 2. ■

Theorem 5 Let e > be arbitrarily small. If k = uj^A"^'^ logn), then after an application of ER, 
one has 



m - ^(i) 



< e, Ve E -E, VI < j < J 



>l-o(l), (17) 



as n ^ +00. The algorithm runs in time 0(A"^n^). 
Proof: The proof is identical to that of Theorem 4. ■

6 Concluding Remarks 

1. We have assumed that delays are finitely supported. This assumption is not essential. Unbounded
distributions for which similar concentration inequalities can be obtained lead to the same results.
For example, using [14, Proposition 4.18], one can treat the case of Exponential and Gamma
delays.

2. It is an interesting problem, from a practical point of view, to improve the dependence of our 
results on J. 

3. It is somewhat intriguing that the reconstruction of the topology of the tree required the joint 
distributions on pairs of leaves whereas the reconstruction of delays (in the asymmetric case) 
required the joint distributions on triples of leaves. A similar situation holds in phylogenetics [Ji-
It could be interesting to prove that this is indeed necessary in some sense.

4. Throughout, the model was assumed to be static. In real-life networks, characteristics of the 
network change over time. One could try to adapt our algorithm to a more dynamic setting. See 
for example [5] for a discussion of temporal issues. 
A Examples of Regular Delay Distributions

Below, we give two typical examples of families of distributions covered by our results. The first 
example is a set of continuous distributions with few parameters. The second example is a general 
discrete distribution. The latter is the main focus of [23].

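Uniform distributions. Let $\mathcal{Q} = \{Q_\theta\}_{\theta \in \Theta}$ be the family of distributions where $Q_\theta$ is uniform on
$[0, \theta]$ with $\Theta = [\underline{\theta}, \bar\theta]$ for some $0 < \underline{\theta} < \bar\theta < +\infty$. Let $\hat w^{(2)}$ be the estimated variance and define

$$\hat\theta(\hat w^{(2)}) = \begin{cases} \underline{\theta}, & \text{if } 12\hat w^{(2)} < \underline{\theta}^2, \\ \bar\theta, & \text{if } 12\hat w^{(2)} > \bar\theta^2, \\ \sqrt{12\hat w^{(2)}}, & \text{otherwise.} \end{cases}$$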



Assume \w^'^^ 
that 



w 



(2)| <S = ^. From 02, 



Jo 



WQe - Q§\\i 

and assuming w.l.o.g. that 9 > 6 (the other case is symmetric 

1^.. 1 



-9){9 + 9), it follows easily that \9 - 9\ < f . Note 



i-x<e 



x<9 



dx, 



I 

Jo 



'-x<e 



"-xKO 



dx 



l-'-]+i9-9)'-<2^- 
ff 9 ^ '9- 



<e. 



Therefore, Q is (e, 2)-regular for any e > 0. 
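
The parameter estimate for the uniform family can be sketched as follows. It uses only the fact that a uniform distribution on $[0, \theta]$ has variance $\theta^2/12$; the function name `theta_hat` and the argument names are hypothetical.

```python
import math


def theta_hat(var_est, theta_lo, theta_hi):
    """Invert variance = theta^2 / 12, clipping to the interval [theta_lo, theta_hi]."""
    if 12.0 * var_est < theta_lo ** 2:
        return theta_lo
    if 12.0 * var_est > theta_hi ** 2:
        return theta_hi
    return math.sqrt(12.0 * var_est)


# Hypothetical parameter range [1, 5]; an estimated variance of 0.75 maps to theta = 3.
inside = theta_hat(0.75, 1.0, 5.0)
```

Clipping keeps the estimate inside the allowed parameter range even when the variance estimate is noisy.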



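Bounded discrete distributions. Let M be a positive integer and let [M] = {0, 1, . . . , M}. Also,
let $0 < \underline{\theta} < 1$ and

$$\Theta = \Big\{ \theta = (\theta_0, \theta_1, \ldots, \theta_M) : 0 \le \theta_i \le 1, \ \forall i \in [M], \ \theta_0 \ge \underline{\theta}, \ \text{and} \ \sum_{i \in [M]} i\theta_i \in [M] \Big\}.$$

Denote by $\mathcal{Q} = \{Q_\theta\}_{\theta \in \Theta}$ the family of distributions on [M] such that $X \sim Q_\theta$ means

$$\mathbb{P}[X = i] = \theta_i, \quad \forall i \in [M].$$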

The assumption on the mean of X in the definition of $\Theta$ greatly simplifies the calculations below. It
is a reasonable approximation in the standard practical case where $\mathcal{Q}$ is a discretization of continuous
densities with a large number of bins M. The assumption on $\theta_0$ simply indicates that the distribution
has been translated to "start at 0." Define $\mu = \mathbb{E}[X]$ where $X \sim Q_\theta$ and let $\theta' = (\theta'_{-M}, \theta'_{-M+1}, \ldots, \theta'_{M})$
where $\theta'_{i-\mu} = \theta_i$ for all $i \in [M]$ and 0 otherwise. Note that the following holds

$$\sum_{i=-M}^{M} i^j \theta'_i = w^{(j)}(\theta), \quad \forall j \in [2M+1],$$



or in matrix form $A\theta' = w$. From the Vandermonde structure of A it follows easily that $\det A \ge 1$,
that is, $A^{-1}$ exists, and furthermore $\|A^{-1}\|_1$ is a strictly positive constant depending on $\theta$, M. Let $\hat w$
be the estimate of w and let $\hat\theta' = A^{-1}\hat w$. Then, it follows that for any $\epsilon > 0$ there is $\delta > 0$ such that

$$\|\hat\theta' - \theta'\|_1 \le \|A^{-1}\|_1 \, \|\hat w - w\|_1 \le \epsilon,$$

whenever $\|\hat w - w\|_\infty \le \delta$. Assume further that $\epsilon < \underline{\theta}/2$, then we can recover an estimate $\hat\theta$ of $\theta$ from
$\hat\theta'$ such that $\|\hat\theta - \theta\|_1 \le \epsilon$. Indeed, our assumptions above allow us to infer a distribution centered at
0 which we then translate to start at 0. Therefore, $\mathcal{Q}$ is $(\epsilon, 2M - 1)$-regular. Note that strictly speaking
one should force all components of $\hat\theta$ to be in [0, 1] and renormalize appropriately. Details are omitted.
Figure 7: Illustration of routine Mini Contractor.



B DMR Algorithm 

We shall now provide an outline of the DMR algorithm. The general DMR algorithm actually allows 
the user to build a "forest" when the number of samples is too small. We will not use this feature here 
and we therefore simplify the algorithm accordingly. The input to the algorithm is a (f, M)-distorted 
metric W on n leaves. In particular, we assume that the values f and M are known to the algorithm. 
We denote the true tree by T = (V, E). Take $\alpha, \alpha' > 0$ and $0 < \beta, \beta' < 1$ such that

6<a' + 3<a< (d)"S 

and 

{py^M + f<(3M< hp'M -3f]. 

(Here it is assumed that $M = \omega(f)$.) The details of the subroutines Mini Contractor and Extender
can be found in Figures 9 and 10. The reader is referred to [8] for a detailed explanation of the
algorithm, which is somewhat involved. In a nutshell, for each pair of leaves u, v that are not "too
far": 1) the algorithm finds all edges sitting on the path between u and v (as illustrated in Figure 7);
2) then it derives the bipartitions corresponding to these edges by "extending" the bipartitions in a
small ball around u, v (as illustrated in Figure 8).

• Pre-Processing: Proximity Test. Build the graph $H_\beta = (V_\beta, E_\beta)$ where $V_\beta = L$ and $(u,v) \in E_\beta \iff W(u,v) < \beta M$;

• Main Loop.

  - For all pairs of leaves $u, v \in V_\beta$ such that $(u,v) \in E_\beta$:

    * Mini Reconstruction. Compute

      $$\{\psi_j(u,v)\}_{j=1}^{r(u,v)} := \text{Mini Contractor}(H_\beta; u, v);$$

    * Bipartition Extension. Compute

      $$\{\hat\psi_j(u,v)\}_{j=1}^{r(u,v)} := \text{Extender}(H_\beta, \{\psi_j(u,v)\}_{j=1}^{r(u,v)}; u, v);$$

  - Deduce the tree $\hat T$ from the bipartitions $\{\hat\psi_j(u,v)\}_{j=1}^{r(u,v)}$;

• Output. Return the resulting tree $\hat T$.
Figure 8: Illustration of routine Extender.



Algorithm Mini Contractor 
Input: Graph Hfj] Leaves w, u; 
Output: Bipartitions {tpj{u,v)}'^ji^{^ , 

• Ball. Let 



B 



^,\u,v) := i.w e Hp : Wiu,w) W Wiv,w) < (^' m\ 



5(0), 



• Intersection Points. For all w G B\,'{u,v), estimate the point of intersection between 
(distance from u), that is, 

- ( d{u, v) + d{u, w) — d{v, w) j ; 



$„ 



• Long Edges. Set S := B\, {u,v) ~ {u}, a;_i = u, j := 0; 

- Until 5* = 0: 

* Let xo — argmin{$uj : w £ S} (break ties arbitrarily); 

* If ^xo ~ ^x-i > ct'f, create a new edge by setting tpj^i{u,v) := {B^, {u,v) — S,S} and let 



* Else, set Cj :— Cj U {xq}; 

* Set S :— S ~ {xq}, x^i := xq; 

Output. Return the bipartitions {'ipj{u,v)Yjl^{ 






Figure 9: Algorithm Mini Contractor. 
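
The intersection-point step of Mini Contractor can be illustrated in isolation. The sketch below is a simplified illustration under stated assumptions, not the full routine: `intersection_point` evaluates the formula $\frac{1}{2}(d(u,v) + d(u,w) - d(v,w))$ from the figure, while `split_by_gaps` is a hypothetical helper that orders leaves by that value and starts a new group whenever the gap to the previous leaf reaches a threshold, mimicking the "long edge" test.

```python
def intersection_point(d_uv, d_uw, d_vw):
    """Distance from u at which the path to w leaves the path from u to v."""
    return 0.5 * (d_uv + d_uw - d_vw)


def split_by_gaps(points, threshold):
    """Group leaves sorted by intersection point; a gap >= threshold opens a new group."""
    ordered = sorted(points.items(), key=lambda item: item[1])
    groups, current, previous = [], [], None
    for name, value in ordered:
        if current and value - previous >= threshold:
            groups.append(current)
            current = []
        current.append(name)
        previous = value
    if current:
        groups.append(current)
    return groups


# Hypothetical example: w hangs off the u-v path at distance 1 from u,
# with d(u,v) = 3, d(u,w) = 4, d(v,w) = 5.
branch = intersection_point(3.0, 4.0, 5.0)
groups = split_by_gaps({"w1": 0.1, "w2": 1.0, "w3": 1.05, "v": 3.0}, 0.5)
```

Leaves whose intersection points fall close together are treated as attached at the same place on the path, which is how noisy distance estimates are tolerated.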
Algorithm Extender

Input: Graph Hp; Bipartitions {ipj{u,v)}^-i^^^ ; Leaves u,v; 

Output: Bipartitions {'0j(w, u)}^l']'" ; 

• For 7 = 1,..., r{u, v) (unless r{u, v) = 0): 

— Initialization. Denote by i/;:"'(u,w) the vertex set containing u in the bipartition ipj{u,v), and 
similarly for v] Initialize the extended partition tpy^ {u,v) := tpj {u,v), f/JJ^ (u,w) := tp^^ {u,v); 

- Modifiei 
removed; 



Modified Graph. Let K be Hp where all edges between i/ij" (m, u) and ■0:" (u,w) have been 



— Extension. For all w € wL — {i/^j {u,v) U ^p^^ {u,v)), add w to the side of the partition it is 
connected to in K (by definition of K, each w as above is connected to exactly one side); 

Return the bipartitions {fpjiujV)}^,!^^" . 

Figure 10: Algorithm Extender. 